A Laplacian Eigenmaps Based Semantic Similarity Measure between Words
Abstract
Measuring semantic similarity between words is important in many applications. In this paper, we propose a method based on Laplacian eigenmaps to measure semantic similarity between words. First, we attach semantic features to each word. Second, we calculate a similarity matrix, into which the semantic features are encoded, in the original high-dimensional space. Finally, with the aid of Laplacian eigenmaps, we recalculate the similarities in the target low-dimensional space. An experiment on the Miller-Charles benchmark shows that the similarity measurement in the low-dimensional space achieves a correlation coefficient of 0.812, compared with 0.683 in the high-dimensional space, a relative improvement of 18.9%.
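The three steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which semantic features, similarity function, or graph construction the paper uses, so the toy feature matrix `X`, the use of cosine similarity in both spaces, the fully connected affinity graph, and the helper names `cosine_similarity_matrix` and `laplacian_eigenmaps` are all assumptions made for the example.

```python
import numpy as np

# Hypothetical semantic feature vectors (rows = words, columns = made-up features).
# These values are illustrative only and are not taken from the paper.
words = ["car", "automobile", "journey", "voyage", "boy", "lad"]
X = np.array([
    [1.0, 0.8, 0.1, 0.0, 0.0],  # car
    [1.0, 0.6, 0.2, 0.0, 0.0],  # automobile
    [0.1, 0.9, 1.0, 0.2, 0.0],  # journey
    [0.2, 0.8, 1.0, 0.1, 0.0],  # voyage
    [0.0, 0.1, 0.1, 1.0, 0.9],  # boy
    [0.0, 0.2, 0.0, 1.0, 1.0],  # lad
])


def cosine_similarity_matrix(M):
    """Pairwise cosine similarity between the rows of M (assumed measure)."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    Mn = M / np.clip(norms, 1e-12, None)
    return Mn @ Mn.T


def laplacian_eigenmaps(W, n_components):
    """Embed the nodes of a graph with symmetric, nonnegative affinities W.

    Solves L y = lambda D y through the symmetric normalized Laplacian
    L_sym = I - D^{-1/2} W D^{-1/2}, then maps back with y = D^{-1/2} v.
    Returns the embedding and all eigenvalues in ascending order.
    """
    W = W.copy()
    np.fill_diagonal(W, 0.0)  # drop self-loops
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.clip(d, 1e-12, None))
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)
    Y = d_inv_sqrt[:, None] * eigvecs
    # Column 0 belongs to eigenvalue ~0 (the trivial constant solution); skip it.
    return Y[:, 1:n_components + 1], eigvals


# Step 2: similarity matrix in the original high-dimensional feature space.
S_high = cosine_similarity_matrix(X)

# Step 3: low-dimensional embedding, then similarities recomputed there.
Y, eigvals = laplacian_eigenmaps(S_high, n_components=2)
S_low = cosine_similarity_matrix(Y)

i, j = words.index("car"), words.index("automobile")
print("high-dim similarity:", S_high[i, j])
print("low-dim similarity: ", S_low[i, j])
```

In an evaluation like the one described, the entries of `S_high` and `S_low` for the benchmark word pairs would each be correlated against the human ratings; the reported 18.9% is the relative gain (0.812 − 0.683) / 0.683.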
Similar resources
Coloring of DT-MRI Fiber Traces Using Laplacian Eigenmaps
We propose a novel post processing method for visualization of fiber traces from DT-MRI data. Using a recently proposed non-linear dimensionality reduction technique, Laplacian eigenmaps [3], we create a mapping from a set of fiber traces to a low dimensional Euclidean space. Laplacian eigenmaps constructs this mapping so that similar traces are mapped to similar points, given a custom made pai...
Broadcast News Story Segmentation Using Probabilistic Latent Semantic Analysis and Laplacian Eigenmaps
This paper proposes to integrate probabilistic latent semantic analysis (PLSA) and Laplacian Eigenmaps (LE) for broadcast news story segmentation. PLSA can address synonymy and polysemy problems by exploring underlying semantic relations beneath the actual occurrences of words. LE can provide a data transformation with the advantage of preserving the original temporal structure of sentence cohe...
Supervised Laplacian Eigenmaps with Applications in Clinical Diagnostics for Pediatric Cardiology
Electronic health records contain rich textual data which possess critical predictive information for machine-learning based diagnostic aids. However, many traditional machine learning methods fail to simultaneously integrate both vector space data and text. We present a supervised method using Laplacian eigenmaps to augment existing machine-learning methods with low-dimensional representations o...
A WordNet-based Semantic Similarity Measure Enhanced by Internet-based Knowledge
Approaches for measuring semantic similarity between words have been widely employed in various areas such as Artificial Intelligence, Linguistics, Cognitive Science and Knowledge Engineering. A new semantic similarity measure is proposed in this paper, which exploits the knowledge retrieved from a semantic network (i.e., WordNet) and the Internet. In particular, the structure information from ...
Algorithm for Semantic Based Similarity Measure
A document representation model, the Semantic Based Similarity Measure (SBSM), is proposed. This model combines phrase analysis and word analysis, using PropBank notation as background knowledge, to explore better ways of representing documents for clustering. The SBSM assigns semantic weights to both document words and phrases. The new weights reflect the semantic relatedne...